A Comparison of WordNet and Roget's Taxonomy for Measuring Semantic Similarity

Author

  • Michael Mc Hale
Abstract

This paper presents the results of using Roget’s International Thesaurus as the taxonomy in a semantic similarity measurement task. Four similarity metrics were taken from the literature and applied to Roget’s. The experimental evaluation suggests that the traditional edge counting approach does surprisingly well (a correlation of r=0.88 with a benchmark set of human similarity judgements, with an upper bound of r=0.90 for human subjects performing the same task).
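As a rough illustration of what edge counting and the reported correlation involve, here is a minimal sketch. The taxonomy below is a made-up toy hierarchy, not Roget's, and the names `PARENT`, `edge_distance`, and `pearson_r` are assumptions introduced for this example; the paper's actual metrics and data may differ.

```python
from math import sqrt

# Hypothetical toy taxonomy (child -> parent). This is NOT Roget's hierarchy.
PARENT = {
    "car": "vehicle",
    "bicycle": "vehicle",
    "vehicle": "artifact",
    "fork": "cutlery",
    "cutlery": "artifact",
    "artifact": "entity",
    "bird": "animal",
    "animal": "entity",
}


def path_to_root(node):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def edge_distance(a, b):
    """Count the edges on the path between a and b through their lowest common ancestor."""
    depth_from_a = {n: i for i, n in enumerate(path_to_root(a))}
    for steps_from_b, n in enumerate(path_to_root(b)):
        if n in depth_from_a:
            return depth_from_a[n] + steps_from_b
    raise ValueError("nodes share no common ancestor")


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    pairs = [("car", "bicycle"), ("car", "fork"), ("car", "bird")]
    # Fewer edges means more similar, so negate the distance to get a similarity score.
    scores = [-edge_distance(a, b) for a, b in pairs]
    # Hypothetical human ratings, invented purely for illustration.
    human = [3.5, 1.2, 0.4]
    print(pearson_r(scores, human))
```

In an evaluation of this kind, the taxonomy-based scores for a benchmark list of word pairs are correlated with the averaged human ratings for the same pairs, and the resulting r is compared with the agreement among the human judges themselves.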

Similar resources

Evaluation of Automatic Updates of Roget's Thesaurus

Keywords: lexical resources, Roget's Thesaurus, WordNet, semantic relatedness, synonym selection, pseudo-word-sense disambiguation, analogy. Abstract: Thesauri and similarly organised resources attract increasing interest of Natural Language Processing researchers. Thesauri age fast, so there is a constant need to update their vocabulary. Since a manual update cycle takes considerable time, autom...

Exploring Noun-Modifier Semantic Relations

We explore the semantic similarity between base noun phrases in clusters determined by a comprehensive set of semantic relations. The attributes that characterize modifiers and nouns are extracted from WordNet and from Roget's Thesaurus. We use various machine learning tools to find combinations of attributes that explain the similarities in each category. The experiments gave promising results, w...

Automatic Construction of Persian ICT WordNet using Princeton WordNet

WordNet is a large lexical database of the English language in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is mainly used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...

Roget's thesaurus and semantic similarity

Roget’s Thesaurus has not been sufficiently appreciated in Natural Language Processing. We show that Roget's and WordNet are birds of a feather. In a few typical tests, we compare how the two resources help measure semantic similarity. One of the benchmarks is Miller and Charles’ list of 30 noun pairs to which human judges had assigned similarity measures. We correlate these measures with those...

Publication date: 1998